Large numbers of explanatory variables, a semi-descriptive analysis.
Abstract
Data with a relatively small number of study individuals and a very large number of potential explanatory features arise particularly, but by no means only, in genomics. A powerful method of analysis, the lasso [Tibshirani R (1996) J Roy Stat Soc B 58:267-288], takes account of an assumed sparsity of effects, that is, that most of the features are nugatory. Standard criteria for model fitting, such as the method of least squares, are modified by imposing a penalty for each explanatory variable used. There results a single model, leaving open the possibility that other sparse choices of explanatory features fit virtually equally well. The method suggested in this paper aims to specify simple models that are essentially equally effective, leaving detailed interpretation to the specifics of the particular study. The method hinges on the ability to make initially a very large number of separate analyses, allowing each explanatory feature to be assessed in combination with many other such features. Further stages allow the assessment of more complex patterns such as nonlinear and interactive dependences. The method has formal similarities to so-called partially balanced incomplete block designs introduced 80 years ago [Yates F (1936) J Agric Sci 26:424-455] for the study of large-scale plant breeding trials. The emphasis in this paper is strongly on exploratory analysis; the more formal statistical properties obtained under idealized assumptions will be reported separately.
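As a rough illustration of what "a very large number of separate analyses" could look like, the sketch below places the feature indices in a square grid and fits one small least-squares regression per row and per column, so that each feature is assessed twice, each time alongside a different set of companions. The grid arrangement, the coefficient threshold, the rule of keeping features flagged in both analyses, and the names grid_groups, least_squares and screen are all assumptions made for this example; the abstract does not specify the authors' procedure. No intercept is fitted, so the data are assumed to be centered.

```python
import random


def grid_groups(n_features, side):
    """Hypothetical arrangement: put indices 0..n_features-1 in a side x side
    grid and return the row groups followed by the column groups."""
    if n_features > side * side:
        raise ValueError("grid too small for the number of features")
    rows = [[] for _ in range(side)]
    cols = [[] for _ in range(side)]
    for j in range(n_features):
        r, c = divmod(j, side)
        rows[r].append(j)
        cols[c].append(j)
    return [g for g in rows + cols if g]


def least_squares(X, y):
    """Ordinary least squares (no intercept) via the normal equations,
    solved by Gauss-Jordan elimination with partial pivoting."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(k)]
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [u - f * v for u, v in zip(A[r], A[p])]
    return [A[a][k] / A[a][a] for a in range(k)]


def screen(X, y, side, threshold):
    """Keep features whose fitted coefficient exceeds `threshold` in absolute
    value in BOTH analyses they take part in (an assumed, crude rule)."""
    p = len(X[0])
    hits = [0] * p
    for group in grid_groups(p, side):
        sub = [[row[j] for j in group] for row in X]
        for j, b in zip(group, least_squares(sub, y)):
            if abs(b) > threshold:
                hits[j] += 1
    return [j for j in range(p) if hits[j] == 2]


if __name__ == "__main__":
    # Simulated data: 60 individuals, 100 features, two real effects.
    random.seed(0)
    n, p = 60, 100
    X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3 * row[3] + 3 * row[57] + random.gauss(0, 0.5) for row in X]
    print(screen(X, y, side=10, threshold=1.5))
```

Each of the 20 regressions here involves only 10 features, so ordinary least squares remains usable even though the full set of 100 features exceeds the number of individuals. A practical version would replace the raw coefficient threshold with a proper significance measure.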
Journal title:
- Proceedings of the National Academy of Sciences of the United States of America
Volume 114, Issue 32
Pages -
Publication date 2017